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Abstract 



A method for extracting maximal resolution power spectra from microwave 
sky maps is presented and applied to the 2 year COBE data, yielding a 
power spectrum that is consistent with a standard = 1, Qrms—ps ~ 

20 fiK 

model. 

By using weight functions that fall off smoothly near the galactic cut, it is 
found that the spectral resolution Ai can be more than doubled at ^ = 15 
and more than tripled at ^ = 20 compared to simply using galaxy-cut spher- 
ical harmonics. For a future high-resolution experiment with reasonable sky 
coverage, the resolution around the CDM Doppler peaks would be enhanced 
by a factor of about 100, down to A£ ~ 1, thus allowing spectral features 
such as the locations of the peaks to be determined with great accuracy. 
The reason that the improvement is so large is basically that functions with 
a sharp edge at the galaxy cut exhibit considerable "ringing" in the Fourier 
domain, whereas smooth functions do not. The method presented here is 
applicable to any survey geometry, chopping strategy and exposure pattern 
whatsoever. The so called signal-to- noise eigenfunction technique is found 
to be a special case, corresponding to ignoring the width of the window 
functions. 
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1 INTRODUCTION 



There has been a surge of interest in the cosmic microwave background 
radiation (CMB) since the first anisotropics of assumed cosmological origin 
were detected by the COBE DMR experiment (Smoot et al. 1992). On the 
experimental front, scores of new experiments have been carried out and 
many more are planned or proposed for the near future. On the theoretical 
front, considerable progress has been made in understanding how the CMB 
power spectrum Q depends on various cosmological model parameters - see 
Hu & Sugiyama (1995), Bond et al. 1994 and references therein for recent 
reviews of analytical and quantitative aspects of this problem. This aim of 
this paper is to strengthen the link between these two fronts, by presenting 
a method allowing more accurate estimation of the power spectrum from 
experimental data. 

The usual approach to power-spectrum estimation from CMB exper- 
iments has been to assume that some specific model is true, and use a 
likelihood analysis to find the model parameters that best fit the observed 
data. Since COBE is mainly sensitive to multipoles £ < 20, where many 
models predict a power spcctrvmi Ce that is well fit by a simple power law, 
most published work on the topic has focused on estimating merely two 
parameters; the overall normalization and a spectral index n, roughly the 
logarithmic slope of the power spectrum. This approach has since been gen- 
eralized to quite a variety of two-parameter models, for instance to a class of 
curved Q < 1 cosmologies (Gorski et al. 1995), where the model parameters 
are and the normalization, and to flat O < 1 cosmologies with a cosmo- 
logical constant (Bunn & Sugiyama 1995 - hereafter BS95; Stompor et al. 
1995) where the model parameters are the cosmological constant A and the 
normalization. The most elaborate such scheme to date is that of White 
&; Bunn (1995, hereafter WB95), using a three-parameter model. For the 
purpose of making such parameter fits efficiently, a number of sophisticated 
data analysis techniques have been invented, where the data is expanded 
in some set of basis functions that are orthogonal on the celestial sphere 
after the galaxy has been cut out. It has recently been shown (Tegmark &; 
Bunn 1995, hereafter referred to as TB95) that the orthogonalized spherical 
harmonics (Gorski 1994, hereafter G94) and the signal-to-noisc eigenmodes 
(Bond 1994 - hereafter B95, BS95) give results that are both virtually iden- 
tical and virtually optimal for parameter fitting, so that the choice of which 
method to use is mostly an issue of computational efficiency. 

However, as CMB data continues to improve in quantity and quality. 
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it becomes increasingly attractive to go beyond mere parameter fitting and 
break the shackles of model-dependence. Just as is routinely done with the 
power spectrum P{k) of galaxies, we wish to simply estimate the angular 
power spectrum without assuming any model, i.e., estimate the entire se- 
quence of multipole moments Ci from the data one by one, and so to say 
let the data speak for itself. To estimate say the first 30 multipoles in this 
fashion thus means fitting a 30-parameter model to the data. Given the 
numerical difficulties of fitting even three parameters at once, this is clearly 
unfeasible with the above-mentioned methods at the present time. 

Fortunately, this program can readily be carried through by a more direct 
approach, where the step of maximum-likelihood estimation is omitted, and 
the multipoles Ci are roughly speaking estimated by simply expanding the 
data in some modified versions of the spherical harmonics and then squaring 
the coefficients. Except for technicalities like shot-noise correction, this is 
analogous to the way that the matter power spectrum P{k) is estimated 
from galaxy redshift surveys: the observed density field is expanded in some 
functions that resemble plane waves, and then the expansion coefficients are 
squared. Pioneering work on this problem (Peebles 1973; Hauser & Peebles 
1973) has recently been extended and applied to the 2 year COBE data 
(Wright et al. 1994b, hereafter W94), and the result of such an analysis is 
shown in Figure |^ (top). When estimating power spectra, it is customary 
(White, Scott & Silk 1994) to place both vertical and horizontal error bars 
on the data points, as in Figure |l|. The former represent the uncertainty 
due to noise and cosmic variance, and the latter refiect the fact that an 
estimate of Ci inadvertently also receives contributions from other multipole 
moments. As is well-known, this unavoidable effect is caused by incomplete 
sky coverage, which destroys the orthogonality of the spherical harmonics. 
Another way of phrasing this is that any estimate of Ci will be the true power 
spectrum convolved with a window function that has some non-zero width 
A£. Here and throughout, we will refer to these horizontal bars Ai as the 
resolution of the power spectrum estimate. Poor resolution (large Ai) clearly 
destroys information by smearing out features in the true spectrum. For 
typical ground- and balloon-based experiments probing degree scales, this 
spectral blurring Ai/i tends to be of order unity, which makes it difficult to 
resolve details such as the number of Doppler peaks. A much better method 
is that presented in W94, where the relative resolution Ai/i is brought down 
to the order of 25% by using the COBE sky map. In this paper, we will see 
how to reduce the horizontal error bars still further, down to their theoretical 
minimum, which for the COBE data is seen to be a resolution Ai ~ 1. The 



2 



method is akin in spirit to that of Tegmark (1995, hereafter T95) for power 
spectrum estimation from galaxy surveys, in which the best way to weight 
the galaxies turns out to be given, surprisingly, by the Schodinger equation. 

This paper is organized as follows. The problem is formalized in Section ^ 
and solved in Section ^ for the most general case. The method is applied to 
the special case of the 2 year COBE DMR data in Section ^. In Section 
it is seen that, just as for the above-mentioned galaxy survey problem, the 
gist of the method can be understood from a simple analogy with quantum 
mechanics. The method is compared to other techniques in Section |6|, and 
its implications for future CMB experiments are discussed. 

2 THE PROBLEM 

Although the following three sections are rather technical in nature, focus- 
ing on the application to real data, we will see in Section |^ that the essence 
of both the problem and its solution have quite a simple and intuitive inter- 
pretation. 

Since the computations that follow are all linear algebra, vector and 
matrix notation will be used as extensively as possible. If the introduction 
of some notation is found too cursory, the reader is referred to TB95 for 
more details. Although the true CMB fluctuations are described by some 
smooth temperature distribution ^ over the celestial sphere, a real-world 
CMB sky map is always discretized into some finite number of pixels, say 
N of them. Let us write the CMB sky map as the A^-dimensional vector x, 
where 

AT 

Xi = — (fi)> (1) 

and f j is a unit vector in the direction of the i*'^ pixel. = 6144 for the 
all-sky COBE DMR map. 

For reasons that will become clear later, we define a new vector z = Ax, 
where A is some completely arbitrary N' x N matrix. Later on, we will see 
that considerable savings in computer time can often be made by choosing 
A wisely. Using equation (2) from TB95, we obtain 

z = A{Yai + e) (2) 

by choosing the matrix A to have all its row vectors orthogonal to the 
monopole and the dipole, thus eliminating the "nuisance contribution" from 
these multipoles. The N x cxD-dimensional spherical harmonic matrix Y is 
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defined as Yix = Y£m(fj), where we have combined £ and m into the single 
index A = £^ + £ + m + 1 = 1, 2, 3, .... (Throughout this paper, we use real- 
valued spherical harmonics, which are obtained from the standard spherical 
harmonics by replacing e*'"'^ by ^/2smm(j), 1, \/2cosm(/> for m < 0, m = 0, 
m > respectively.) Making the standard assumption that the CMB is 
Gaussian on the scales probed, a is an infinite-dimensional Gaussian random 
vector with zero mean and with the diagonal covariance matrix 

{a\a\') = Sxx'B^Ce, (3) 

where is the angular power spectrum and where i?| is the experimental 
beam function. The A^-dimensional noise vector e is assumed to be Gaussian 
with (e) = and {eiEj) = aiajSij (for the COBE case, see Lineweaver et al. 
1994), where o", is the rms noise of pixel i. 

We wish to estimate the power spectrum from z. As Ci tends to 
decrease rapidly with £, we will find it convenient to define 

D, = BjCfJfie (4) 

for some weights fj,i that make the coefficients D£ of similar magnitude, as 
was done in WB95. We will leave arbitrary for now, and estimate Di. 
Everything below refers to some definite multipole i*, say I* = 7, and is to 
be repeated separately for all other £*-values of interest. The most general 
estimate of Dp that is quadratic in the data can clearly be written as 

D = z^Ez, (5) 

where E is some arbitrary real symmetric N' x N' matrix. Let us expand 
it in an orthogonal set of eigenvectors {ejt} as 

N" 

E =Y1 (^k^k^k, (6) 
fc=l 

where 7^ (we simply omit vanishing eigenvalues from the sum, so 
that A^" is the rank of the matrix). Since D is by definition positive, we 
clearly want E to be positive semidefinite, i.e., > 0. Substituting equa- 
tions (§), (0) and (^ into (P), we obtain 

N" 

{D) = Y.^k 

k=l 

00 

= Y.^iD, + B, (8) 

e=2 



tr y*^*efcefe*yiy(aa*) + tr ^*efcefe*yl(ee*) 



(7) 



4 



where we have defined the window function 



N" £ 

vj^ii,Y.^k E (9) 

fe=l m=-i 

and the noise bias 

N" N 

B^E«^'E(^*efc)?^?' (10) 

k=l i=l 

and V = AY. This merely reflects well-known fact that the expectation 
value of any quadratic combination of pixels is just the power spectrum 
convolved with some window function, plus a positive noise bias. We define 
the noise-corrected estimate by 

f)Corr = f)_B. (11) 

We want the window function to represent a weighted average of the D^- 
coefficients, which requires it to be normalized so that 

oo 

Y.vj = l. (12) 

e=2 

Defining 

oo 

(/W) = E^'/W (13) 

for an arbitrary function /, we can write this normalization condition as 
simply (1) = 1. Ideally, we would want the window function to be v'^ = 6u*, 
but this is of course impossible to achieve if the sky coverage is incomplete. 
Given this limitation, we simply want the window function to be as narrow 
as possible, centered around i*. We may for instance choose to minimize 
its variance (second moment) about £*, {{£ — or more generally the 

quantity {\i — for some positive exponent z/. Let us allow for complete 
freedom of preferences by minimizing (pi) for a completely arbitrary penalty 
function pi. We thus arrive at the following optimization problem: 
Find the matrix E that minimizes (pi) subject to the constraint that (1) = 1. 

3 THE SOLUTION 

We will now solve this constrained minimization problem. It is easy to see 
that an optimal solution can always be found where the matrix E has rank 
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1, i.e., where it is of the simple form E = ee* for some vector e. Below we 
will find a set of N' different rank 1 matrices Gk^k which have the following 
properties: 

• They are independent in the sense that ((e^ • z)(efc/ • z)) oc S^k'- 

• k = 1 corresponds to an optimal solution, k = 2 to the second best 
solution, etc. 

If the only objective were to minimize the horizontal error bars in Figure |^, 
the optimal choice would thus he E = eiei*. However, we clearly also want 
to keep the vertical error bars as small as we can. Since cosmic variance drops 
as we average many different estimates, it is in general better to choose E 
to be some weighted average of the above-mentioned rank one matrices, as 
given by equation when normalizing the weights as ^k=i o^fc = 1- We 
clearly face a trade-off between vertical and horizontal error bars: as we 
increase N" , the vertical error bar decreases, but since we are including 
increasingly wide window functions, the horizontal error bar increases. For 
reasonable sky coverage except for a galactic cut, the choice of A^" turns out 
to be quite an easy one: the best {2i* + 1) rank one matrices clearly stand 
out in front of the rest of the pack, and all give quite similar error bars. 
Once we have fixed N" , changing the weights leaves the horizontal error 
bars fairly unaffected, and we can choose Ok so as to minimize the vertical 
error bars. In the limit of complete sky coverage, these {2i* + 1) best vectors 
ek approach the {2i* + 1) spherical harmonics in question, just as we would 
expect. 

Now let us find the vectors e^. Writing E = ekGk (no summation 
implied), we solve the constrained minimization problem for using the 
standard method of Lagrange multipliers. Defining 



and requiring the derivatives with respect to the components of to vanish, 
we obtain 




(14) 



(Q - \R)ek = 0, 



(15) 



where we have defined the N' x N' matrices 



TV 



Qij 



(16) 



A k=l 




(17) 



A 
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Here we have added the term ■^B to the target function to make the noise 
bias B (and thus the vertical error bars) small. The choice of 7 is discussed 
below. Since both Q and R are symmetric, this generalized eigenvahic prob- 
lem has N' real solutions, which wc normalize so that (1) = 1 and sort by 
their eigenvalues (the smallest eigenvalue corresponds to fc = 1, the best so- 
lution, etc.). Most standard eigenvalue packages (such as the public-domain 
package EISPACK, or that included in NAG) provide a specialized routine 
for precisely this problem: the generalized eigenvalue problem where Q and 
R are real and symmetric, and R is positive definite. 

Once we have solved this and chosen N" , the variance of our estimate is 

F(L>) = 2^^afeafc,MfeV, (18) 

k k' 

where the matrix M^ki = ((e^ • z){ek' • z)) is given by 

N 

Mkk' = ^(l/*efc)A(F*efc,)AW^^ + J^i'^'^kUA'ek'Uf. (19) 

A i=l 

When M is almost diagonal, the variance is approximately minimized if we 

choose the weights ak oc Mj7^. It is easy to show that the first term will be 
strictly diagonal. The second term will typically be close to diagonal, with 
a slight off-diagonality caused by heteroscedastic noise. 

3.1 How to chose the peirameters 

The above treatment was very general, leaving the choice of the matrix A, 
the weights the penalty function p£ and the constant 7 to be chosen 
according to the preferences of the user. We now discuss some aspects of 
the choice of these free parameters. 

The matrix A. A is conveniently constructed in two steps: 

1. One makes some natural choice such as the spherical harmonic matrix 
y*, the noise- weighted and orthogonalized ditto introduced by G94, 
the time-saving matrix described in the next section, or the identity 
matrix if one prefers to simply use the pixel values for a brute-force 
approach a la TB95, obtaining the best possible result at a high com- 
putational cost. 

2. One makes all the row vectors orthogonal to the monopole and the 
dipole, as described in e.g. TB95. 
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The matrix A also allows us to directly incorporate data sets other than 
sky maps. For example, for a double-chop configuration such as that of the 
Tenerife experiment, the observed data vector z is a certain linear combina- 
tion of the actual sky pixels x (which are never observed directly) , and since 
the coefficients in these linear combinations (the elements of A) are known, 
the matrix A incorporates all we need to know about the data acquisition 
process. 

The weights This is basically the choice of what to label the verti- 
cal axis with in Figure |l|. If we probe a linear function with a symmetric and 
correctly centered window function, the estimates will be unbiased. How- 
ever, if we are probing the power spectrum Ce, which is generally believed 
to have convex shape (the second derivative of say !/£{£ + 1) is positive), 
then our estimates tend to be biased high, since the upward bias from cou- 
pling to lower multipoles (the so called "red leak") is stronger than the 
downward bias from coupling to higher multipoles (the "blue leak"). For 
instance, when attempting to estimate C20, even a very slight sensitivity to 
the quadrupole C2 is likely to dwarf the signal one is trying to measure, as 
pointed out in e.g. W94 and illustrated in Figure ^. A simple strategy for 
reducing such problems is to choose //^ proportional to the expected power 
spectrum multiplied by the experimental window function, i.e., 



so that the coefficients Di will all be of the same order of magnitude. This 
is quite a natural choice also because power spectra are conventionally plot- 
ted with the quantity £{i + 1)Ci/2tt oc Di on the vertical axis. Thus the 
horizontal error bars will refer to exactly the quantities that we plot, i.e., 
D£, which is of course the most honest way to present the results. 

The penalty function p. The simple choice pe = {£ — i*)'^ is quite 
adequate if fj,£ is a mere constant. This gives pcfj^c ~ as ^ ^ cxd, and cor- 
responds to the quadratic penalty function = A;^ as ^ ^ 00 in the galaxy 
survey problem treated in T95. In T95, it was seen that this k"^ penalty 
in Fourier space corresponded to a Laplace operator in real space, and 
that this guaranteed that the resulting solutions were continuous and well- 
behaved. A k'^ term would make the solutions even smoother, guaranteeing 
that all derivatives would be continuous as well, etc. If pipi asymptoti- 
cally grows slower than however, high frequencies in the weight function 
are not sufficiently suppressed, and there is a risk that the solutions will be 
quite irregular and unnatural. Since we advocated choosing /i/ = Bj /£{£+!) 




Bj 



(20) 
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above, we thus need to make pi —>■ oo at least as fast as i'^/Bj as £ — > oo. In 
addition, if one is particularly worried about say non-cosmic contamination 
from a particular multipole like the quadrupole, one can increase the value 
of p2, thereby reducing the value of the window function at i = 2. There 
are of course no free lunches: the price of reducing in this fashion is that 
other unwanted window function entries increase. 

The constant 7. Since there is a trade-off between vertical and hori- 
zontal error bars, we minimize a combination of them. The larger we make 
7, the greater the emphasis on the vertical ones. A larger 7 tends to increase 
the weights given to the least noisy pixels. As long as the pixel noise levels 
are of the same order of magnitude all across the map, the results are quite 
insensitive to the choice of 7, and one might as well set 7 = 0. 

It should be stressed that these free parameters reflect a strength rather 
than a weakness of the method. The goal is to find optimal window func- 
tions, and the choices oi fi, p and 7 simply reflect what we mean by "good". 
Different choices give slightly different error bars on the resulting plots, but 
even if one criterion for "good" is used in the optimization process and a 
different one is used when judging the results, the optimization tends to 
improve the situation greatly over simply using spherical harmonics. In 
Section H, we will see why. 

3.2 Analyzing very large data sets 

When applying this method to COBE, where the matrices are never larger 
than 4038 x 4038, the solution of the resulting eigenvalue problem is numer- 
ically straightforward. Future high resolution CMB experiments, however, 
are likely to produce data sets where the pixels are counted in millions rather 
than thousands. Although it is obviously not feasible to diagonalize matrices 
of this size, the method presented above can nonetheless be readily applied 
to such data sets, as follows: 

• For small I, almost no information is lost if the number of pixels is 
reduced by smoothing prior to the analysis. 

• For large i, it is convenient to split the sky into many smaller patches, 
analyze them separately, and then combine the results. This of course 
lowers the spectral resolution somewhat, but as is discussed in Sec- 
tion III, A£/£ can still be kept extremely small, of the order of a few 
percent. 
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4 APPLICATION TO COBE 



We now apply the method to the two year COBE DMR data, combining 
the 53 and 90 GHz channels and removing all pixels less than 20° away from 
the galactic plane. This leaves = 4038 pixels. 

4.1 Pcirameter choices 

Just as in WB95, we choose //^ = Bj/l{i + 1), where Bj is the COBE 
beam window function published by Wright et al. (1994a). In line with our 
discussion above, we estimate D^* by chosing the penalty function 



This behaves as a simple quadratic penalty function p£ (x {£ — l*)"^ near 
whose symmetry ensures that the "red leak" will be similar to the "blue leak" 
and thus that (£) ~ but also has the above-mentioned desired property 
that Piiii oc £^ as £ ^ oo. As the COBE signal-to-noise is quite good and 
the pixel noise does not vary radically across the sky, we make the simple 
choice 7 = 0. As the noise bias B is of order = (70/xK)^£(^-F 1)/NB^ and 
the optimal horizontal error bars Ai are of order unity, a natural alternative 
choice would be say 7 = 1/55*. This choice would mean roughly that we 
value a 1% reduction in the horizontal error bar as much as a 5% reduction 
in the vertical error bar. We choose A to be the first 961 rows of Y*, which 
corresponds to the spherical harmonic components up to £ = 30. Wc place 
the cutoff at £ = 30 simply because the COBE-beam effectively washes 
out the signal from £ S> 20, leaving mostly noise at these high frequencies. 
Indeed, by comparing G94 and TB95, one concludes that z contains almost 
all the cosmological information in x with this high frequency cutoff. We 
then throw out the first four rows of A, corresponding to the monopole and 
dipole, and make the remaining rows orthogonal to them. 

4.2 A trick for saving time 

Because the galactic cut preserves reflection symmetry about the galactic 

plane, even and odd multipolcs remain orthogonal to one another (to very 
good accuracy, the only slight correction arising from pixelization effects). 
When estimating the Di corresponding to an even £, we thus throw out all 
rows corresponding to odd £, and vice versa. 




(21) 
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Although diagonalizing 500 x 500 matrices (A^' being of order 961/2) is 
numericahy straightforward, the solution of the eigenvalue problem given 
by equation (|l5|) can be simplified further. Since the galactic cut preserves 
azimuthal symmetry, spherical harmonics with different m-values are also 
orthogonal to each other, so by simply regrouping the rows of A by m-values, 
the matrices Q and R become block-diagonal. In other words, we can do 
the following: 

1. Pick some m- value and choose A to be the rows of that are not 
orthogonal to the row corresponding to Yf^m- For say m = and I* 
even, A will consist of the N' = 14 rows corresponding to l2,m) Y4,m, 

2. Compute Q and R, truncating the sums (ID and at say £ = 60, 
since the COBE beam exponentially suppresses multipoles i ^ 15. 



3. Find the smallest eigenvalue of equation (|15|), normalize the corre- 
sponding eigenvector so that (1) = 1 and denote it e^- 

4. Repeat the corresponding procedure for the other m-values, finally 
obtaining the N" = 2t + 1 vectors {e-m}. 

5. Compute the matrix M (as with all other methods, error bar estima- 
tion of course requires assuming some reasonable power spectrum — 
here the spectrum n = 1, Qrms-ps = 20/iK is used) and the optimal 
weights Om- 

6. Compute W^"^'^ and the vertical and horizontal error bars. 

7. Repeat everything for all other ^*-values of interest. 
This is what has been done to produce Figure g (bottom). 



4.3 Results 

As expected. Figure || (bottom) shows the vertical error bars being domi- 
nated by cosmic variance for low I and by noise for large i, especially for 
i ^ 15, where beam smearing takes its heavy toll. As to the horizontal 
error bars, the bars that specify the spectral resolution, we see that they 
exhibit hardly any ^-dependence — we will return to this fact in the next 
section, and see that it has a simple intuitive explanation. Having said this 
about the error bars, we now turn to the location of the data points. If the 
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true power spectrum Cg were a simple n = 1 Harrison-Zel'dovich spectrum 
with normalization Qrms-ps = 20^K, then we would expect about 68% of 
the data points in Figure |^ to lie within the shaded region, which is the 
1 — 0" error region for this model, and this is clearly the case, in agreement 
with previous analyses such as W94 and de Oliveira-Costa & Smoot (1995). 
Comparing our results to those obtained with the truncated spherical har- 
monic method (top), we see that the individual multipole estimates agree 
quite well for low i, where the spectral resolution Ai of the two methods 
is comparable, but start to differ as i increases, reflecting the fact that the 
data points in the upper figure are probing the average power in a much 
wider band of ^-values. 

For verification, 1000 Monte-Carlo simulations were performed, where 
fake COBE-skies with noise and galaxy cut were generated with n = 1, 
Qrms-ps = 20/uK and then piped though the analysis software. The result 
was in perfect agreement with the theoretically predicted error bars, with 
zero bias, about 68% of the estimated multipoles falling within the shaded 
region in Figure |l], etc. The COBE analysis was also repeated for a number 
of different choices of the parameters 7, /i and p, and the results were found 
to remain virtually unaffected. In other words, the results obtained with 
the method we have presented are quite robust. 

5 THE QUANTUM ANALOGY 

Whereas sections ^, |3| and ^ of this paper were rather technical, this section 
shows that all qualitative features of the method can be understood from 
a simple analogy with quantum mechanics. Although all the elements of 
realism included above are important when applying the method to real 
data, one can in fact understand the qualitative features by ignoring most 
of these complications. 

First of all, let us ignore the fact that a real CMB sky map is pixelized. 
We let the function x(r) denote the temperature fluctuation AT/T in the 
direction of the unit vector f . In the discrete real-world case, each pixel 
has some r.m.s. noise variance cj^ and effectively covers some solid angle 
of the sky. For our continuous analogue, let us define the noise variance 
function V(r) by V = a^^l, where a and O refer to a pixel in the direction r. 
If we neglect contamination problems, it is easy to see that V(r) is simply 
proportional to the inverse of the time spent observing the direction r (the 
small-scale details of the pointing of the antenna are irrelevant, as the beam 
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smearing will ensure that the function V is smooth on angular scales below 
the beam width). Secondly, let us ignore the nuisance terms from monopole, 
dipole, etc., so that we can omit A from equation (^). Finally, we choose 
fj.i = 1. 

Since the power Di* has units of //K^, we clearly want to estimate it by 
some quantity D that is quadratic in the data. The simplest estimator of 
this type is 

(22) 

where ip is some function on the sphere, and above we showed that the most 
general estimate is just a weighted average of such estimates. A straightfor- 
ward calculation shows that 

oo ^ „ 

(^) = E E \i'in.\'De+ / m?)\^V{i)dn, (23) 
e=Om=-i •' 

where ipim denotes the spherical Fourier transform of ip, i.e., the coefficients 
in an expansion of ip in spherical harmonics. In other words, we see that the 
expectation value of our estimator is the sum of two terms of quite different 
character. The first, the contribution from cosmology, is the power spectrum 
convolved with a window function = IV'/mP- The second, the contri- 
bution from noise, is just an average value of V, the weights being | •(/'(?) p. 
This is very similar to the result when estimating the power spectrum from 
a galaxy survey (T95). Just as in that paper, we will find it convenient 
to use the standard Dirac quantum mechanics notation with kets, bras and 
linear operators. This allows us to write 'ipim = {i^m\^lJ). A window function 
should always integrate to unity, so the correct normalization for tp is just 
= 1. Defining the operator 

oo i 

L = J2Y1 ¥m){im\, (24) 

e=Om=-i 

equation (^) becomes simply 

{D) = {^\DL + V{^m. (25) 

Note that L is a scalar operator satisfying L\im) = i\im), and is related 
to the (vector) angular momentum operator L = —ir x V through = 
L{L + 1). 



D 
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Now what is the best choice of ^p? Equation ( [2^ ) tells us that D is 
the square modulus of a random variable whose real and imaginary parts 
are both Gaussian. Thus if V is real, the standard deviation of D (the 
vertical error bar) is simply ^/2 times its expectation value. (With random 
phases, the real and imaginary parts contribute equally, and this decreases 
the variance by a factor of \/2-) This means that we minimize the vertical 
error bars by minimizing (D). But assuming that our window function is 
narrow enough that we are measuring mostly what we want to measure, 
Di*, the first term in equation (|25| ) satisfies {iI^\Dl\iI;) « D£*, independent 
of V', so we minimize the vertical error bars by simply minimizing the second 
term, {iplV (r)\ijj) . As a measure of the horizontal error bars, we will use Ai, 
the r.m.s. deviation of the window function from i* . With our quantum 
notation, we have simply = {ip\{L — £*)'^\Tp) . As was discussed above, it 
is impossible to minimize both error bars at the same time, since there is a 
trade-off between them. It would be like asking for the best and cheapest car. 
Instead, the best we can do is minimize some linear combination E = {H), 
where we have defined 

H^{L-rf + jV{?), (26) 

and the parameter 7 specifies how concerned we are about the vertical error 
bar relative to the horizontal one. Continuing our quantum analogy, we 
see that we want to find the ip that minimizes the total "energy", where 
the "kinetic energy" (L — £*)'^ corresponds to the horizontal error bar and 
the "potential energy" 'yV(r) corresponds to the vertical error bar. If we 
for the sake of illustration set i* = —1/2, we simply want to minimize 
(V'jL^ + "fV {r)\ijj) , given the constraint {^\ip) = 1. Introducing a Lagrange 
multiplier E, we arrive at the Schrodinger equation 

[■L^ + jV{^)M=E\i^). (27) 

In other words, we want to find the ground state wavefunction for a particle 
confined to a sphere with some potential. Numerically solving for various 
integer values of i* gives functions with similar behavior, modulated by a 
wiggling similar to that of the corresponding spherical harmonics. 

From our knowledge of quantum mechanics, we can immediately draw a 
number of conclusions about the solutions, all which turn out to agree well 
with the exact numerical results. 

• l^/^p will be small in regions where the noise variance is large, so regions 
that received little observation time will receive low weights in the 
analysis. 
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Except for the case of complete sky coverage, we will have > 0. 

If incomplete sky coverage confines ijj to a, region of the sky whose 
angular diameter in the narrowest direction is of order A^, then the 
uncertainty principle tells us that the minimum A£ must be at least 
of order 1/ A9. 

This limit on the spectral resolution is independent of i (that this 
remains approximately true in the real-world case as well is illustrated 
in Figure |^ . 

For a sky map of COBE type, where A9 is of order a radian given 
a 20° galactic cut, the uncertainty principle thus gives Ai > 1. This 
agrees well with the horizontal error bars actually attained in Figure 
1 (bottom). 

If the sky-coverage is incomplete, V is infinite outside of the region cov- 
ered, and we recover the quantum-mechanical particle-in-a-box prob- 
lem. From this we know that tp will always go to zero smoothly as 
it approaches the survey boundary. This is illustrated in Figure ^ 
(bottom) by a numerical example. 

This smoothness of ■0 is really the gist of the method, as it radically re- 
duces "ringing" in Fourier space, "kinetic energy" , without increasing 
the "potential energy" {iplV {r)\ip) much at all. This is seen by com- 
paring figures § and ^: whereas the upper and lower weight functions 
look quite similar in real space (Figure |2|) , they differ dramatically in 
the Fourier (multipole) domain (Figure y). 

If the noise level V is constant wherever we have data, then the po- 
tential energy term will reduce to 'y{tp\V\ip) = ^V{ip\Tp) = , i.e., 
become independent of ij). This means that the solution will be inde- 
pendent of 7. For any evenly sampled data set, the choice of 7 is thus 
irrelevant, so we might as well chose 7 = for simplicity, as we did in 
our COBE analysis. 

If the survey volume consists of several disconnected parts, then Ai 
is limited by the AO of the largest part. For the galaxy-cut COBE 
case, for instance, using only the northern half of the sky gives al- 
most the same Ad. as using both the northern and southern skies com- 
bined. (However, including both of course helps reduce the vertical 
error bars.) 
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6 DISCUSSION 



A method for extracting maximal resolution power spectra from CMB sky 
maps has been presented and applied to the 2 year COBE DMR data. 

6.1 COBE results 

The power spectrum extracted from the 2 year COBE data in Figure | 
(bottom) is seen to be consistent with the standard n = I, Qrms-ps = 20 
model. This model is close to the best-fit models found in the various two- 
parameter Bayesian likelihood analyses published {e.g., G94, B95, BS95, 
TB95, WB95), which we can interpret as the best-fit straight line through 
the data points in Figure || being close to the horizontal heavy line. As 
has frequently been pointed out (see e.g. WB95), Bayesean methods by 
their very nature can only make relative statements of merit about different 
models, and never address the question of whether the best-fit model itself is 
in fact inconsistent with the data. As an absurd example, the best fit straight 
line to a parabola on the interval [—1, 1] is horizontal, even though this is a 
terrible fit to the data. It is thus quite reassuring that the power spectrum 
in Figure m (bottom) not only has the right average normalization and slope, 
but that each and every one of the multipoles appear to be consistent with 
this standard best fit model. 

6.2 How the method differs from other techniques 

A number of other linear techniques for CMB analysis have recently been 
published and applied to the COBE data. Both the Karhunen-Loeve (KL) 
signal-to-noise eigenmode method (B95, BS95) and the orthogonalized spher- 
ical harmonics method (G94) are devised to solve a different problem than 
that addressed here. If one is willing to parametrize the power spectrum by a 
small number of parameters, for instance a spectral index and an amplitude, 
then these methods provide an efficient way of estimating these parameters 
via a likelihood analysis. Why cannot the basis functions of these methods 
be used to estimate the angular power spectrum Ci directly, as they are 
after all orthogonal over the galaxy-cut sky? The answer is that these ba- 
sis functions are orthogonal to each other, whereas in our context, we want 
them to be as orthogonal as possible not to each other but to the spherical 
harmonics. This is illustrated in Figure H], which contrasts i* = 20 window 
functions of the optimal method and the generalized Hauser-Peebles method 



16 



(W94, de Oliveira-Costa & Smoot 1995). We want the window function to 
be centered on I* = 20, and be as narrow as possible, so the lower one is 
clearly preferable. Equation (|9|) shows that the window function will vanish 
for a given i if the weight function is orthogonal to all spherical harmonics 
with that value, so we can interpret Figure ^ as the optimal weight func- 
tion being essentially orthogonal to all spherical harmonics except i = 18, 
i = 20 and i = 22 (the reason that i = 19 and ^ = 21 do not cause a problem 
is that even and odd multipoles remain orthogonal after the galactic cut, 
since parity symmetry is preserved). The other weight function is seen to 
couple strongly to many of the lower multipoles, and picks up a contribution 
from the quadrupole that is even greater than that from i = 20. This of 
course renders it quite inappropriate for estimating the power at i = 20. 
Analogous window functions can of course be computed for the orthogo- 
nalized spherical harmonics of G94 or the signal-to-noise eigenmodes of B95 
and BS95, and they also exhibit window functions that are broader than the 
optimal one in Figure ^ — our derivation of the optimal weight functions of 
course guarantees that any other basis functions will give broader window 
functions. However, it should be emphasized that generating such window 
functions for the basis functions of G94, B95 and BS95 would be quite an 
unfair criticism of these methods, since this would be grading them with re- 
spect to a property that they were not designed to have. These authors have 
never claimed that their basis functions were optimal for multipole estima- 
tion, merely (and rightly so) that they were virtually optimal for parameter 
fitting with a likelihood analysis. 

The data points in Figure |l] are placed at the mean ^-values obtained 
from histograms like that in Figure ^, and as mentioned, the horizontal error 
bars are simply the r.m.s. widths of the histograms about this ^-value. The 
horizontal error bars resulting from the optimal method are contrasted with 
those obtained by the generalized Hauser-Peebles method in Figure ^ It 
is seen that the former gives error bars Ai ~ 1, whereas the latter gives 
a resolution A£ that increase strongly with i. The new method is seen to 
more than double the spectral resolution at ^ = 15 and more than triple 
it at ^ = 20. At i = 300, the gain would be about a factor of 100 for a 
high-resolution all-sky map with a 20° galactic cut. 

As the low multipoles Cg are intrinsically much larger, histograms like 
that of Figure ^ exhibit mainly leakage from the left, from lower multi- 
poles. Thus the generalized Hauser-Peebles method probes an effective £- 
value {£) < I for large as pointed out in W94. It is seen that the new 
method eliminates this "read leak" problem. 
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There is quite an interesting connection between this method and the 
KL signal-to-noise eigenmode technique of B95 and BS95. The KL-method 
is that which optimizes the signal-to-noise ratio, i.e., loosely speaking that 
which minimizes the vertical error bars without regard for the horizontal 
ones. Indeed, by taking 7 — > oo (which corresponds to dropping the first 
term in equation (0)) and choosing //^ to be the expected power spectrum, 
it is readily seen that the present method reduces to the KL method. In 
other words, the KL method is a special case of the method presented here, 
corresponding to ignoring the spectral resolution. 

6.3 Qualitative features of the method 

Figure Q can be easily understood from the analogy with quantum mechanics 
discussed in the previous section. Consider a free particle constrained to the 
surface of a sphere. If its wavefunction is required to vanish in some region, 
then it cannot be in an angular momentum eigenstate, and the Heisenberg 
uncertainty principle tells us that Ai must be bounded from below by some 
positive constant Ai^,. Which wavefunction minimizes Ai given this con- 
straint? Basically, does. So when the forbidden region is a 20° strip above 
and below the equator, A£^ corresponds to half the height of the double- 
shaded strip. If the wave- function is abruptly truncated at the edge of this 
strip (as is the case if we simply use spherical harmonics or orthogonal- 
ized versions thereof), this sharp edge causes considerable ringing in Fourier 
space, which is why the single-shaded region is so much wider. Like the 
Gaussian minimum-uncertainty states, the optimal set of orthogonal func- 
tions Cfc fall off smoothly rather than abruptly as they approach the galactic 
cut. This difference is readily seen in Figure ^. Although the difference 
between the naive and optimal weight functions may appear minor in real 
space (Figure they are seen to be quite radical in the Fourier (multipole) 
domain (Figure |3|) . 

6.4 Imphcations for future experiments 

The fact that our method produces a spectral resolution Ai of order 1/A6, 
independent of i, where AO is the smallest survey dimension, has quite 
encouraging implications for future CMB experiments. It means that a 
high-resolution satellite experiment covering a square radian on the sky can 
produce Ai 1 all the way out to the Doppler peaks, thus allowing very 
precise determination of cosmological parameters. If a long-duration balloon 
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flight (or a satellite map with dusty regions removed) produces a map of a 
10° X 10° patch of sky, then A£ ^ 6 and the location of a Doppler peak 
at say i = 200 could be pinpointed to an accuracy A£/i of a few percent. 
When designing such future CMB experiments, it should be borne in mind 
that since the smallest dimension is what limits the resolution, a square map 
is always superior to a rectangular one of the same area. 

The author wishes to thank Ted Bunn, George Efstathiou, Carlos Frenk, 
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Figure 1: The COBE power spectrum before and after optimization. 
The observed multipoles Di = i{i + l)Ci are plotted with 1 — a error bars. 
The vertical error bars include both pixel noise and cosmic variance, and 
the horizontal error bars show the width of the window functions used. If 
the true power spectrum is given by n = 1 and Qrms-ps = 20fiK (the 




Figure 2: Weight functions before and after optimization. 
The weight function e used to estimate the muhipole f. = 20, m = is 
plotted in Hammer-Aitoff projection in galactic coordinates. Black pixels 
receive large positive weight, white pixels receive large negative weight, and 
the grey shade of the galactic cut corresponds to zero weight. The func- 
tions shown are the relevant galaxy-cut spherical harmonic (top) and the 
eigenfunction e with the smallest eigenvalue (bottom). 
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Figure 3: Window functions before and after optimization. 
Two window functions for estimation of the multipole i = 20, m = 
are shown. The upper one is that of the spherical harmonic method 
(W96), which exhibits a strong leakage from lower multipoles such as the 
quadrupole. The lower one is the one resulting from the optimal method. 
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Figure 4: Horizontal error bars before and after optimization. 

The shaded region shows the range of values {i)zizAi probed by the window 
function devised to estimate Dj(, the heavy line showing the mean {i). The 
wide, jagged region is the result of using spherical harmonics, whereas the 
narrow strip results from the optimal method described. 
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